Partially lexicalized parsing model utilizing rich features

Authors

So-Young Park

Yong-Jae Kwak

Joon-Ho Lim

Hae-Chang Rim

Soo-Hong Kim

Abstract

In this paper, we propose a partially lexicalized parsing model that uses rich features to improve parsing ability and reduce parsing cost. To disambiguate parse trees effectively, the model employs several features, including a syntactic label feature, a content feature, a functional feature, and a size feature. It is only partially lexicalized, which reduces the parsing cost that is closely tied to lexical information. It is also designed to represent the word order variation and constituent ellipsis found in Korean sentences. Experimental results show that the proposed model, which uses more features, performs better even though it depends less on lexical information.


Similar resources

Scalable Discriminative Parsing for German

Generative lexicalized parsing models, which are the mainstay for probabilistic parsing of English, do not perform as well when applied to languages with different language-specific properties such as free(r) word order or rich morphology. For German and other non-English languages, linguistically motivated complex treebank transformations have been shown to improve performance within the frame...


Multilingual discriminative lexicalized phrase structure parsing

We provide a generalization of discriminative lexicalized shift reduce parsing techniques for phrase structure grammar to a wide range of morphologically rich languages. The model is efficient and outperforms recent strong baselines on almost all languages considered. It takes advantage of a dependency based modelling of morphology and a shallow modelling of constituency boundaries.


Towards Efficient Statistical Parsing Using Lexicalized Grammatical Information

For a long time, the goal of wide-coverage natural language parsers had remained elusive. Much progress has been made recently, however, with the development of lexicalized statistical models of natural language parsing. Although lexicalized tree adjoining grammar (TAG) is a lexicalized grammatical formalism whose development predates these recent advances, its application in lexicalized statis...


Cross Parser Evaluation and Tagset Variation: a French Treebank Study

This paper presents preliminary investigations on the statistical parsing of French by bringing a complete evaluation on French data of the main probabilistic lexicalized and unlexicalized parsers first designed on the Penn Treebank. We adapted the parsers on the two existing treebanks of French (Abeillé et al., 2003; Schluter and van Genabith, 2007). To our knowledge, mostly all of the results...
